@ServeurpersoCom
Collaborator

Introduce OpenAI-compatible model selector in JSON payload

This PR adds a minimal model selector to the WebUI sidebar, allowing users to pick an available model exposed through the OpenAI-compatible /v1/models endpoint.

The selector automatically fetches and lists models from the server, persists the selected model in local storage, and sends it in the JSON body of subsequent /v1/chat/completions requests. The selection logic mirrors OpenAI’s client behavior while remaining fully offline-compatible with local llama.cpp instances.

This enables direct interoperability with OpenAI-compatible clients and simplifies multi-model setups in the WebUI.
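
The flow described above (list models, then send the chosen one in the request body) could look roughly like this. It is a minimal sketch under stated assumptions, not the PR's actual code: `parseModelList` and `buildChatPayload` are hypothetical names, and only the `data[].id` shape of a /v1/models response and the `model` field of a /v1/chat/completions body are taken from the OpenAI-compatible API.

```typescript
// Hypothetical helpers for illustration; names are not taken from the PR.

interface ModelListResponse {
	data?: Array<{ id?: unknown }>;
}

interface ChatMessage {
	role: 'system' | 'user' | 'assistant';
	content: string;
}

interface ChatRequestBody {
	messages: ChatMessage[];
	stream: boolean;
	model?: string;
}

/** Extract model ids from a /v1/models style response, skipping malformed entries. */
function parseModelList(response: ModelListResponse): string[] {
	const ids: string[] = [];
	for (const entry of response.data ?? []) {
		if (typeof entry.id === 'string' && entry.id.trim() !== '') {
			ids.push(entry.id);
		}
	}
	return ids;
}

/** Build a /v1/chat/completions body; `model` is only set when a model is selected. */
function buildChatPayload(messages: ChatMessage[], selectedModel: string | null): ChatRequestBody {
	const body: ChatRequestBody = { messages, stream: true };
	if (selectedModel) {
		body.model = selectedModel;
	}
	return body;
}
```

Leaving `model` out when nothing is selected keeps the request valid for a standalone server that ignores the field.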

Restore OpenAI-Compatible model source of truth and unify metadata capture:

This change re-establishes a single, reliable source of truth for the active model,
fully aligned with the OpenAI-Compat API behavior.

It introduces a unified metadata flow that captures the model field from both
streaming and non-streaming responses, wiring a new onModel callback through ChatService.
The model name is now resolved directly from the API payload rather than relying on
server /props or UI assumptions.

ChatStore records and persists the resolved model for each assistant message during
streaming, ensuring consistency across the UI and database.
Type definitions for API and settings were also extended to include model metadata
and the onModel callback, completing the alignment with OpenAI-Compat semantics.
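
As an illustration of that metadata flow, the sketch below reads the `model` field from both a complete response body and from streamed SSE chunks, then reports it through an `onModel` callback. Only the callback name comes from the description above. `extractModel`, `handleCompletion`, and `handleStreamLines` are hypothetical, and the real ChatService wiring may differ.

```typescript
// Hypothetical sketch of capturing `model` from OpenAI-compatible responses.

type OnModel = (model: string) => void;

/** Return the `model` field of a parsed response or chunk if it is a non-empty string. */
function extractModel(payload: unknown): string | undefined {
	if (typeof payload !== 'object' || payload === null) return undefined;
	const model = (payload as { model?: unknown }).model;
	return typeof model === 'string' && model.trim() !== '' ? model.trim() : undefined;
}

/** Non-streaming case: the whole JSON body is a single payload. */
function handleCompletion(body: unknown, onModel?: OnModel): void {
	const model = extractModel(body);
	if (model) onModel?.(model);
}

/** Streaming case: scan SSE `data:` lines and report the model once, from the first chunk carrying it. */
function handleStreamLines(lines: string[], onModel?: OnModel): void {
	let reported = false;
	for (const line of lines) {
		if (!line.startsWith('data:')) continue;
		const data = line.slice('data:'.length).trim();
		if (data === '[DONE]') break;
		let chunk: unknown = undefined;
		try {
			chunk = JSON.parse(data);
		} catch {
			continue; // this sketch simply skips malformed chunks
		}
		const model = extractModel(chunk);
		if (model && !reported) {
			reported = true;
			onModel?.(model);
		}
	}
}
```

Both paths go through the same `extractModel` helper, which is the point of a single source of truth: the UI never guesses the model name.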

Remaining '/props' usage audit in the WebUI:

A repository-wide search inside 'tools/server/webui' shows the remaining '/props' references are intentional because the WebUI still needs to bootstrap and validate server capabilities outside of chat responses:

  • 'src/routes/+layout.svelte' and 'src/lib/stores/server.svelte.ts' fetch '/props' on application startup to populate the global server store with template, model alias, and capability metadata that never appears in chat completions.
  • 'src/lib/components/app/server/ServerErrorSplash.svelte' and 'src/lib/components/app/chat/ChatScreen/ChatScreenWarning.svelte' surface fallback UI when '/props' is unreachable, ensuring the user understands cached data might be stale.
  • 'src/lib/utils/api-key-validation.ts' validates API keys against '/props' so that the UI can warn about incompatible keys before issuing chat requests.
  • 'src/lib/services/chat.ts' performs a last-resort fetch to '/props' when the streaming handshake fails, preserving compatibility with legacy servers that only expose model names via that endpoint.

@ServeurpersoCom
Collaborator Author

TL;DR:
Adds a lightweight model selector for the WebUI using the /v1/models OpenAI-compatible endpoint.
Selected models are persisted locally and included in chat request payloads (model field).
Also unifies model metadata capture during streaming and non-streaming responses: the WebUI now uses a single source of truth for the active model across the stack.

@ServeurpersoCom
Collaborator Author

ServeurpersoCom commented Oct 13, 2025

@ngxson :) What do you think about this approach? It aims to stay compatible with:

  • the current standalone llama-server,
  • llama-swap,
  • and future multi-model evolutions of llama-server.

It introduces a unified, KISS, OpenAI-compatible model selection path while keeping everything backward-compatible with existing setups.

A standalone llama-server on a Raspberry Pi 5:
[Image: Untitled]
I'll have to filter the model path here too (?)

@ServeurpersoCom
Collaborator Author

@allozaur mind taking a look at those default Svelte arrows and the scrolling manager? I figured your Svelte wizardry might know the cleanest way to get rid of them 😄 I like things to be pixel-perfect, but it looks like this is built into the framework, and I’d rather not bypass Svelte just for that.
SvelteArrow

@allozaur
Collaborator

> @allozaur mind taking a look at those default Svelte arrows and the scrolling manager? I figured your Svelte wizardry might know the cleanest way to get rid of them 😄 I like things to be pixel-perfect, but it looks like this is built into the framework, and I’d rather not bypass Svelte just for that.
>
> SvelteArrow

Yep, will take a look at that and come back to u with an answer 😉

@ServeurpersoCom force-pushed the openai-model-selector branch 2 times, most recently from 45298f8 to 286ca88 (October 20, 2025 06:07)
@ServeurpersoCom
Collaborator Author

Extracted determineInitialSelection helper
Centralized localStorage key constant
Minor Toaster cleanup per review
All checks passing locally
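
For context, a helper like the one mentioned above might look something like this. It is only a guess at the shape: the real `determineInitialSelection` signature is not shown in this thread, and both the `SELECTED_MODEL_KEY` constant and the `StorageLike` interface are hypothetical stand-ins for the centralized localStorage key and the browser's `localStorage`.

```typescript
// Hypothetical sketch; the real helper's signature and key name may differ.

const SELECTED_MODEL_KEY = 'selectedModel'; // illustrative key name only

interface StorageLike {
	getItem(key: string): string | null;
}

/**
 * Pick the model to preselect: the stored choice if the server still offers it,
 * otherwise the first available model, otherwise nothing.
 */
function determineInitialSelection(available: string[], storage: StorageLike): string | null {
	const stored = storage.getItem(SELECTED_MODEL_KEY);
	if (stored !== null && available.indexOf(stored) !== -1) {
		return stored;
	}
	return available[0] ?? null;
}
```

Passing the storage in as a parameter keeps the function pure enough to unit-test without a browser.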

@ServeurpersoCom
Collaborator Author

> I think that placing the model selector in the Sidebar makes its UI a bit too heavy and bloated... A much better place for changing the model would be in the Chat Form, like here:
>
> Computer:
> PC1 PC2 PC3
>
> Smartphone:
> Mobile1 Mobile2 Mobile3

That’s actually a great idea: moving the selector into the Chat Form feels way more natural now that I’ve tested it 😄
The layout is cleaner, lighter, and it fits perfectly in the conversation flow. Definitely the right spot!

1324a4d

@allozaur
Collaborator

> I think that placing the model selector in the Sidebar makes its UI a bit too heavy and bloated... A much better place for changing the model would be in the Chat Form, like here:
>
> Computer:
> PC1 PC2 PC3
>
> Smartphone:
> Mobile1 Mobile2 Mobile3
>
> That’s actually a great idea: moving the selector into the Chat Form feels way more natural now that I’ve tested it 😄
>
> The layout is cleaner, lighter, and it fits perfectly in the conversation flow. Definitely the right spot!
>
> 1324a4d

I will post a PR to improve the UI of this selector as now it's just taking too much space and looks a bit off 😜

@ServeurpersoCom
Collaborator Author

ServeurpersoCom commented Oct 20, 2025

> I will post a PR to improve the UI of this selector as now it's just taking too much space and looks a bit off 😜

Got it: you want to fine-tune the layout so the selector sits closer to the mic button, probably with a max width to keep long model names from stretching the form. We could even take it a step further and replace the full selector with a small model icon or dropdown button that pops up the model list on click.

But we should still make sure the currently selected model (the one actually sent in the request) is clearly visible somewhere, since the /props display above only shows the model currently loaded on the llama-server, not the one chosen by the user through the selector. And on mobile there’s already very little screen space to work with, so keeping it minimal while still informative would be ideal. Actually, the small checkmark in the dropdown list might already be enough to show the active model, though in that case we’d probably need to fix the scrolling glitch in the framework so the menu behaves properly.

@ServeurpersoCom
Collaborator Author

Like this?

[Image: Untitled]
(root|~/llama.cpp.pascal) git diff 0b9aaf8fe8fb36817c74f25e905aedaccfa9a825
diff --git a/tools/server/public/index.html.gz b/tools/server/public/index.html.gz
index c76f5778b..1897d50c8 100644
Binary files a/tools/server/public/index.html.gz and b/tools/server/public/index.html.gz differ
diff --git a/tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormActions.svelte b/tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormActions.svelte
index cbf0385a4..1bdb7f947 100644
--- a/tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormActions.svelte
+++ b/tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormActions.svelte
@@ -32,27 +32,29 @@
 <div class="flex w-full items-center gap-2 {className}">
        <ChatFormActionFileAttachments {disabled} {onFileUpload} />

-       <ChatFormModelSelector class="min-w-[140px] flex-1" />
+       <div class="ml-auto flex items-center gap-2">
+               <ChatFormModelSelector class="flex-shrink-0" />

-       {#if isLoading}
-               <Button
-                       type="button"
-                       onclick={onStop}
-                       class="h-8 w-8 bg-transparent p-0 hover:bg-destructive/20"
-               >
-                       <span class="sr-only">Stop</span>
-                       <Square class="h-8 w-8 fill-destructive stroke-destructive" />
-               </Button>
-       {:else}
-               <ChatFormActionRecord {disabled} {isLoading} {isRecording} {onMicClick} />
+               {#if isLoading}
+                       <Button
+                               type="button"
+                               onclick={onStop}
+                               class="h-8 w-8 bg-transparent p-0 hover:bg-destructive/20"
+                       >
+                               <span class="sr-only">Stop</span>
+                               <Square class="h-8 w-8 fill-destructive stroke-destructive" />
+                       </Button>
+               {:else}
+                       <ChatFormActionRecord {disabled} {isLoading} {isRecording} {onMicClick} />

-               <Button
-                       type="submit"
-                       disabled={!canSend || disabled || isLoading}
-                       class="h-8 w-8 rounded-full p-0"
-               >
-                       <span class="sr-only">Send</span>
-                       <ArrowUp class="h-12 w-12" />
-               </Button>
-       {/if}
+                       <Button
+                               type="submit"
+                               disabled={!canSend || disabled || isLoading}
+                               class="h-8 w-8 rounded-full p-0"
+                       >
+                               <span class="sr-only">Send</span>
+                               <ArrowUp class="h-12 w-12" />
+                       </Button>
+               {/if}
+       </div>
 </div>
diff --git a/tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte b/tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte
index ca48285da..d54147a5e 100644
--- a/tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte
+++ b/tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte
@@ -1,6 +1,6 @@
 <script lang="ts">
        import { onMount } from 'svelte';
-       import { Loader2 } from '@lucide/svelte';
+       import { Bot, Loader2 } from '@lucide/svelte';
        import * as Select from '$lib/components/ui/select';
        import { cn } from '$lib/components/ui/utils';
        import {
@@ -63,7 +63,7 @@
        }
 </script>

-<div class={cn('flex min-w-0 flex-col gap-1', className)}>
+<div class={cn('flex min-w-0 flex-col items-center gap-1', className)}>
        {#if loading && options.length === 0 && !isMounted}
                <div class="flex items-center gap-2 text-xs text-muted-foreground">
                        <Loader2 class="h-4 w-4 animate-spin" />
@@ -81,13 +81,16 @@
                        disabled={loading || updating}
                >
                        <Select.Trigger
-                               class="h-9 w-full min-w-[140px] justify-between rounded-full border border-input bg-background/80 px-3 text-left text-sm sm:min-w-[200px]"
+                               aria-label="Select model"
+                               title={selectedOption?.name || 'Select model'}
+                               class="flex !h-8 !w-8 items-center justify-center !rounded-full border border-input !bg-background/80 !p-0 text-muted-foreground transition-colors hover:!bg-background data-[state=open]:!bg-background [&>svg:last-child]:hidden"
                        >
-                               <span class="truncate font-medium">{selectedOption?.name || 'Select model'}</span>
-
                                {#if updating}
                                        <Loader2 class="h-4 w-4 animate-spin text-muted-foreground" />
+                               {:else}
+                                       <Bot class="h-4 w-4" />
                                {/if}
+                               <span class="sr-only">{selectedOption?.name || 'Select model'}</span>
                        </Select.Trigger>

                        <Select.Content class="z-[100000]">
@@ -105,6 +108,6 @@
        {/if}

        {#if error}
-               <p class="text-xs text-destructive">{error}</p>
+               <p class="text-center text-xs text-destructive">{error}</p>
        {/if}
 </div>
(root|~/llama.cpp.pascal)

@allozaur
Collaborator

allozaur commented Oct 20, 2025

> Like this?

@ServeurpersoCom I was thinking of something more similar to what Claude has:

[Image: Screenshot 2025-10-20 at 10 38 04]

@ServeurpersoCom
Collaborator Author

Rebased on master.

@ServeurpersoCom
Collaborator Author

[Image: Untitled]

@ServeurpersoCom
Collaborator Author

ServeurpersoCom commented Oct 20, 2025

No more Flowbite scroll bug on mobile:
[Image: Untitled]

I'll add a "developer" option with the model selector hidden by default.
But we can enable it by default for all new users in lib/constants/settings-config.ts.
That's important so users can share decentralized AI services with friends, like I'm doing now;
otherwise it would be broken for anyone visiting the page for the first time!

@ServeurpersoCom
Collaborator Author

ServeurpersoCom commented Oct 20, 2025

[Image: Untitled]
Done. Now it’s compliant for all use cases @allozaur

@allozaur
Collaborator

allozaur commented Oct 20, 2025

@ServeurpersoCom

I've tested this on my end simply by running:

llama-server -hf ggml-org/Qwen2.5-Omni-7B-GGUF -c 0 --jinja --parallel 5

and

npm run dev

and I've spotted a few issues:

  1. When I have the Enable model selector option unchecked, the default model should not be gpt-3.5-turbo but the one that I am actually running with llama-server.

On the main screen I see the currently loaded model:

[Image: Screenshot 2025-10-21 at 01 02 14]

But under the message I am seeing this in the model info:

[Image: Screenshot 2025-10-21 at 01 00 23]
  2. Also, after enabling the model selector in Settings, I am seeing the full path to the model file instead of just the model file name in the message model info.
[Image: Screenshot 2025-10-21 at 01 04 05]
  3. Besides that, the selector value string is good, but when the name is long, the UI breaks:
[Image: Screenshot 2025-10-21 at 01 04 10]

@ServeurpersoCom
Collaborator Author

> 1. When i have the `Enable model selector` option unchecked, the default model should not be `gpt-3.5-turbo` but the one that i am actually running with `llama-server`

Oops, that's exactly what the backend does: we need to display the captured chunks and refactor the backend accordingly!
[Image: Untitled]

ServeurpersoCom and others added 19 commits October 22, 2025 13:00
Normalized streamed model names during chat updates
by trimming input and removing directory components before saving
or persisting them, so the conversation UI shows only the filename
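
The normalization described in this commit (trim, then drop directory components) can be sketched as below. This is an illustrative reimplementation under assumptions, not the merged `normalizeModelName` utility: the `string | null` return type and the handling of Windows-style separators are guesses.

```typescript
// Illustrative sketch only; the merged utility may behave differently in edge cases.

/** Trim a model name and keep only its final path segment (the filename). */
function normalizeModelName(name: string): string | null {
	const trimmed = name.trim();
	if (trimmed === '') return null;

	// Split on both POSIX and Windows separators and ignore empty segments,
	// so a trailing slash does not produce an empty name.
	const segments = trimmed
		.split(/[\\/]/)
		.map((segment) => segment.trim())
		.filter((segment) => segment !== '');

	return segments.length > 0 ? segments[segments.length - 1] ?? null : null;
}
```

With this, a streamed name such as `/models/Qwen2.5-Omni-7B.gguf` would be stored and displayed as `Qwen2.5-Omni-7B.gguf`.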

Forced model names within the chat form selector dropdown to render as
a single-line, truncated entry with a tooltip revealing the full name
When the selector is disabled, it falls back to the active server model name from /props

When the model selector is enabled, the displayed model comes from the message metadata
(the one explicitly selected and sent in the request)
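
The two display modes above amount to a small decision function. The sketch below is hypothetical: `resolveDisplayedModel` and its field names are illustrative, and returning `null` when the selector is enabled but the message carries no model is an assumption, since the commit text does not say what happens in that case.

```typescript
// Hypothetical sketch of the legacy vs OpenAI-compatible display logic.

interface DisplayedModelInput {
	/** Whether the model selector setting is enabled. */
	modelSelectorEnabled: boolean;
	/** Model recorded in the assistant message metadata, if any. */
	messageModel?: string | null;
	/** Active server model name as reported by /props, if any. */
	serverModel?: string | null;
}

function resolveDisplayedModel(input: DisplayedModelInput): string | null {
	if (input.modelSelectorEnabled) {
		// Selector on: show the model that was explicitly selected and sent with the request.
		return input.messageModel ?? null;
	}
	// Selector off (legacy mode): show the model currently loaded on the server.
	return input.serverModel ?? null;
}
```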
…rmModelSelector.svelte

Co-authored-by: Aleksander Grygier <[email protected]>
…atMessageAssistant.svelte

Co-authored-by: Aleksander Grygier <[email protected]>
- Replace inline portal and event listeners with proper Svelte bindings
- Introduce 'persisted' store helper for localStorage sync without runes
- Extract 'normalizeModelName' utils + Vitest coverage
- Simplify ChatFormModelSelector structure and cleanup logic

Replaced the persisted store helper's use of '$state/$effect' runes with
a plain TS implementation to prevent orphaned effect runtime errors
outside component context
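
A rune-free persisted store along these lines avoids the orphaned-effect problem because nothing in it depends on a component context. The sketch below is an assumption-laden illustration, not the helper from the PR: the `persisted` signature, the `KeyValueStorage` interface (a stand-in for `localStorage`), and the JSON serialization are all hypothetical choices.

```typescript
// Hypothetical plain-TS persisted store; no Svelte runes involved.

interface KeyValueStorage {
	getItem(key: string): string | null;
	setItem(key: string, value: string): void;
}

type Subscriber<T> = (value: T) => void;

function persisted<T>(key: string, initial: T, storage: KeyValueStorage) {
	let value = initial;

	// Restore a previously stored value; keep the initial value if the data is corrupt.
	const raw = storage.getItem(key);
	if (raw !== null) {
		try {
			value = JSON.parse(raw) as T;
		} catch {
			value = initial;
		}
	}

	const subscribers = new Set<Subscriber<T>>();

	return {
		get: (): T => value,
		set(next: T): void {
			value = next;
			storage.setItem(key, JSON.stringify(next));
			subscribers.forEach((fn) => fn(value));
		},
		subscribe(fn: Subscriber<T>): () => void {
			subscribers.add(fn);
			fn(value); // emit the current value immediately
			return () => {
				subscribers.delete(fn);
			};
		}
	};
}
```

In a browser, `localStorage` satisfies `KeyValueStorage` directly; injecting it as a parameter is what makes the helper usable from module scope and in tests.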

Co-authored-by: Aleksander Grygier <[email protected]>
…rmModelSelector.svelte

Co-authored-by: Aleksander Grygier <[email protected]>
Collaborator

@allozaur allozaur left a comment

Also please make this small change ;)

Collaborator

@allozaur allozaur left a comment

Okay, i think that we are good to go :)

@allozaur allozaur merged commit 9b9201f into ggml-org:master Oct 22, 2025
14 checks passed
FMayran pushed a commit to FMayran/llama.cpp that referenced this pull request Oct 23, 2025
…ml-org#16562)

* webui: introduce OpenAI-compatible model selector in JSON payload

* webui: restore OpenAI-Compatible model source of truth and unify metadata capture

This change re-establishes a single, reliable source of truth for the active model:
fully aligned with the OpenAI-Compat API behavior

It introduces a unified metadata flow that captures the model field from both
streaming and non-streaming responses, wiring a new onModel callback through ChatService
The model name is now resolved directly from the API payload rather than relying on
server /props or UI assumptions

ChatStore records and persists the resolved model for each assistant message during
streaming, ensuring consistency across the UI and database
Type definitions for API and settings were also extended to include model metadata
and the onModel callback, completing the alignment with OpenAI-Compat semantics

* webui: address review feedback from allozaur

* webui: move model selector into ChatForm (idea by @allozaur)

* webui: make model selector more subtle and integrated into ChatForm

* webui: replaced the Flowbite selector with a native Svelte dropdown

* webui: add developer setting to toggle the chat model selector

* webui: address review feedback from allozaur

Normalized streamed model names during chat updates
by trimming input and removing directory components before saving
or persisting them, so the conversation UI shows only the filename

Forced model names within the chat form selector dropdown to render as
a single-line, truncated entry with a tooltip revealing the full name

* webui: toggle displayed model source for legacy vs OpenAI-Compat modes

When the selector is disabled, it falls back to the active server model name from /props

When the model selector is enabled, the displayed model comes from the message metadata
(the one explicitly selected and sent in the request)

* Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormActions.svelte

Co-authored-by: Aleksander Grygier <[email protected]>

* Update tools/server/webui/src/lib/constants/localstorage-keys.ts

Co-authored-by: Aleksander Grygier <[email protected]>

* Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte

Co-authored-by: Aleksander Grygier <[email protected]>

* Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte

Co-authored-by: Aleksander Grygier <[email protected]>

* Update tools/server/webui/src/lib/services/chat.ts

Co-authored-by: Aleksander Grygier <[email protected]>

* Update tools/server/webui/src/lib/services/chat.ts

Co-authored-by: Aleksander Grygier <[email protected]>

* webui: refactor model selector and persistence helpers

- Replace inline portal and event listeners with proper Svelte bindings
- Introduce 'persisted' store helper for localStorage sync without runes
- Extract 'normalizeModelName' utils + Vitest coverage
- Simplify ChatFormModelSelector structure and cleanup logic

Replaced the persisted store helper's use of '$state/$effect' runes with
a plain TS implementation to prevent orphaned effect runtime errors
outside component context

Co-authored-by: Aleksander Grygier <[email protected]>

* webui: document normalizeModelName usage with inline examples

* Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormModelSelector.svelte

Co-authored-by: Aleksander Grygier <[email protected]>

* Update tools/server/webui/src/lib/stores/models.svelte.ts

Co-authored-by: Aleksander Grygier <[email protected]>

* Update tools/server/webui/src/lib/stores/models.svelte.ts

Co-authored-by: Aleksander Grygier <[email protected]>

* webui: extract ModelOption type into dedicated models.d.ts

Co-authored-by: Aleksander Grygier <[email protected]>

* webui: refine ChatMessageAssistant displayedModel source logic

* webui: stabilize dropdown, simplify model extraction, and init assistant model field

* chore: update webui static build

* Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte

Co-authored-by: Aleksander Grygier <[email protected]>

* chore: npm format, update webui static build

* webui: align sidebar trigger position, remove z-index glitch

* chore: update webui build output

---------

Co-authored-by: Aleksander Grygier <[email protected]>
pwilkin pushed a commit to pwilkin/llama.cpp that referenced this pull request Oct 23, 2025